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ABSTRACT 


Our  project  has  initiated  and  will  develop  and  evaluate  a  new  Bayesian  approach  for  nuclear  test  monitoring.  We 
anticipate  that  the  new  approach  will  yield  substantially  lower  detection  thresholds,  possibly  approaching  a 
theoretical  lower  bound  that  we  hope  to  establish.  We  will  also  develop  new  techniques  to  implement  such 
monitoring  capabilities  within  a  general-purpose  Bayesian  modeling  and  inference  system  that  may  eventually 
support  a  wide  range  of  information- system  needs  for  arms  treaties. 

In  ongoing  work  that  is  moving  towards  possible  deployment,  we  have  completed  a  prototype  seismic  monitoring 
system  based  on  a  generative,  vertically  integrated  statistical  model  linking  hypothesized  events  to  “detections” 
extracted  from  raw  signal  data  by  commonly  used  algorithms.  On  test  data  sets  of  naturally  occurring  events  curated 
by  human  experts,  our  system  exhibits  roughly  60%  fewer  detection  failures  than  the  currently  deployed  automated 
system,  SEL3,  that  forms  part  of  the  International  Monitoring  System. 

The  current  phase  of  the  project  moves  away  from  hard- threshold  detections  altogether.  Instead,  the  generative 
model  spans  the  full  range  from  events  to  measured  signal  properties.  Given  the  observed  signal  traces,  the  statistical 
inference  algorithm  attempts  to  maximize  a  whole-network  statistical  measure  of  the  likelihood  that  an  event  -  or 
collection  of  events  -  has  occurred.  Specialized  techniques  such  as  waveform  matching  and  double  differencing  are 
realized  within  our  framework  as  special  cases  of  probabilistic  inference;  our  initial  experiments  using  2D  simulated 
data  indicate  that  a  full  Bayesian  analysis  can  provide  more  accurate  absolute  and  relative  locations  than  double 
differencing,  while  simultaneously  estimating  the  velocity  structure  of  the  observed  region. 

As  we  move  toward  a  full-scale  implementation,  the  primary  tasks  will  involve  the  development  of  accurate 
predictive  models  of  waveform  properties.  These  models  will  combine  both  parametric  forms  (for  example, 
triangular  envelopes  in  multiple  frequency  bands)  and  nonparametric  forms  based  on  previously  observed 
waveforms  from  nearby  events.  Hybrid  models  will  smoothly  interpolate  between  these  two  forms  depending  on  the 
distance  of  the  hypothesized  event  from  previously  observed  events. 
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OBJECTIVES 

Our  objective  is  to  realize  dramatic  improvements  in  sensitivity,  accuracy,  and  robustness  of  global  monitoring 
systems  for  nuclear  tests  through  a  novel  Bayesian  approach  to  whole-network  data  analysis. 

The  project  will  develop  a  new  mathematical  and  computational  approach  to  the  analysis  of  the  sensor  data  collected 
from  global  monitoring  systems.  The  approach  involves  the  application  of  rigorous  Bayesian  statistical  analysis  to 
the  entire  monitoring  problem  and  requires  the  development  of  a  complete,  vertically  integrated,  empirically 
validated,  generative  statistical  model  of  event  occurrence,  signal  propagation,  and  sensing,  as  well  as  efficient  and 
provably  correct  inference  algorithms  for  extracting  the  most  likely  event  history  and/or  a  posterior  distribution  over 
event  histories  from  the  measured  sensor  data.  At  a  fundamental  level,  we  hope  to  gain  a  much  deeper  and  more 
accurate  understanding  of  the  limits  of  detectability  than  is  given  by  a  classical  per-station  SNR  analysis.  We  expect 
to  realize  the  following  benefits:  (1)  Substantially  more  accurate  and  sensitive  detection  and  localization  of  events, 
particularly  at  lower  magnitudes.  (2)  A  monitoring  software  architecture  that  allows  straightforward  incorporation  of 
multiple  sensor  modalities  (including  new  modalities  as  they  arise)  and  new  and  improved  physics-based  models 
such  as  source  models  and  phase  velocity  and  attenuation  models.  (3)  Extensibility  to  other  monitoring  problems 
arising  in  treaty  verification. 


RESEARCH  ACCOMPLISHED 

The  period  of  performance  for  the  DTRA-funded  project  has  not  yet  begun  at  the  time  of  writing;  therefore,  this 
report  covers  some  background  material,  our  relevant  past  work,  some  initial  steps  taken  for  the  new  project,  and  a 
brief  summary  of  our  planned  activities. 

Background:  Bayesian  monitoring 

The  proposed  research  is  motivated  by  a  fundamental  problem  with  current  approaches  to  seismic  monitoring, 
namely,  the  problem  of  robust  detection  of  signals  with  low  signal-to-noise  ratio  (SNR).  Standard  detection  practice 
is  to  declare  a  detection  when  the  average  amplitude  in  a  short  time  window  increases  above  a  predefined  multiple 
of  average  amplitude  in  a  long  time  window  (a.k.a.  short-term  average/long-term  average,  STA/LTA).  Independent 
algorithms  then  attempt  to  associate  detections  to  seismic  events.  To  avoid  creating  a  flood  of  spurious  events,  a 
very  high  SNR  threshold  is  typically  used  for  detection.  This  approach  is  flawed  in  at  least  three  important  ways. 
First,  it  ignores  the  absence  of  detections  in  evaluating  a  proposed  event.  Second,  it  fails  to  take  advantage  of 
spatiotemporally  correlated  signals  at  multiple  stations  to  detect  that  an  event  has  occurred.  Third,  it  discards  details 
of  the  signal  that  support  more  accurate  estimation  of  event  properties  via  techniques  such  as  waveform  matching. 

Our  proposed  approach  is  based  on  Bayesian  inference,  a  method  well-established  in  many  areas  of  science  that 
solve  inverse  problems  including  seismology  itself  (Duijndam,  1988ab).  They  have  been  very  effective  in 
tomographic  applications  (Taylor  et  al.,  2001;  Simmons  et  al.,  2010b)  and  event  localization  (Myers  et  ah,  2007, 
2009)  and  have  been  applied  to  the  general  beamforming  problem  (Bell  et  al.,  2000).  Our  work  (Russell  et  ah,  2009, 
2010;  Arora  et  ah,  2009,  2010)  is  to  our  knowledge  the  first  complete  Bayesian  monitoring  system,  solving  both 
association  and  localization  problems  within  a  unified  probability  model. 

In  general,  Bayesian  inference  yields  a  posterior  probability  distribution  over  a  set  of  hypotheses  X  given  some 
evidence  Y  =  y.  In  seismic  monitoring,  a  hypothesis  is  a  collection  of  events  occurring  over  space  and  time;  the 
evidence  consists  of  the  raw  seismic  (and  other)  waveform  signals  from  all  sensors  over  the  time  period  of  interest. 
The  inference  process  is  based  on  a  probability  distribution  or  model  with  two  components: 

•  The  prior  probability  distribution  Po(X)  over  hypotheses;  for  the  monitoring  problem,  this  includes  the 
natural  seismicity  distribution  and  a  probability  distribution  for  chemical  and  nuclear  explosions  on  or  near 
the  Earth’s  surface.  The  subscript  # represents  the  parameters  of  this  distribution,  which  can  be  input  as 
prior  information  and  learned  from  empirical  data. 

•  The  conditional  probability  distribution  P(!,(Y=y  \  X)  for  the  evidence  given  each  possible  hypothesis;  in 
our  case,  this  part  of  the  model  describes  how  signals  propagate  through  the  earth  and  how  they  are 
detected  by  sensors,  as  well  as  the  ways  in  which  noise  signals  arise.  The  subscript  0  represents  the 
parameters  of  this  distribution;  again,  these  can  be  learned  from  empirical  data,  but  there  is  also  a  great  deal 
of  physics-based  knowledge  that  constrains  the  possible  values.  In  seismology,  the  model  of  propagation  is 
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often  called  the  forward  model ,  although  it  is  important  to  note  that  in  the  Bayesian  context  it  includes  a 
quantitative  characterization  of  uncertainty. 

Bayes’  rule  simply  multiplies  these  two  components  together  to  give  the  posterior  probability  distribution  over  the 
set  of  hypotheses,  given  the  available  evidence: 

P(X  |  Y=y)  =  aP^Y=y\X)Pe(X) 

where  a  is  a  normalizing  constant.  To  the  extent  that  the  prior  and  conditional  distributions  correctly  describe 
knowledge  of  seismicity,  propagation,  and  so  on,  the  posterior  distribution  represents  an  optimal  inference  from  the 
available  data.  An  inference  algorithm  may  compute  a  single  most  likely  explanation  or  MLE  hypothesis,  instead  of 
the  full  posterior  distribution  over  hypotheses.  Because  there  are  infinitely  many  possible  hypotheses  (each  a  set  of 
seismic  events),  the  calculations  involved  are  nontrivial  and  require  efficient  inversion  of  the  forward  model. 

As  a  side  effect  of  the  inference  process,  the  Bayesian  approach  generates  information  that  can  be  used  to 
continuously  adapt  the  model  parameters  6  and  (f>  to  better  explain  the  data.  This  adaptation  requires  no  “ground 
truth”  (unlike  supervised  learning  methods)  and  hence  allows  for  continuous  self-calibration  and  sensor  diagnostics. 


Background:  Detection-based  Bayesian  monitoring 

When  the  raw  data  y  are  complicated  -  as  in  the  case  of  complete  seismic  waveforms  -  it  is  common  for  Bayesian 
methods  to  deal  with  a  simplified  representation  f(y)  of  the  raw  data.  Here / can  be  any  deterministic  function.  In 
detection-based  Bayesian  monitoring ,/ represents  the  signal  processing  algorithms  that  are  used  to  pull  out 
detections  from  the  signal  traces  at  each  station.  That  is ,f(y)  is  the  set  of  all  detections  (with  associated 
characteristics)  from  all  stations  during  the  period  of  interest.  The  performance  of  a  detection-based  Bayesian 
monitoring  system  is  limited  by  the  underlying  detection  algorithm  f  Such  an  algorithm  -  for  example,  thresholding 
based  on  STA/LTA  -  may  fail  to  detect  some  arriving  signals  while  generating  spurious  detections  and  abstracting 
away  important  details  of  true  signals.  Nonetheless,  as  we  show  below,  the  performance  of  a  Bayesian  monitoring 
system  may  be  significantly  better  than  current  systems  that  utilize  the  same  detection  algorithms. 

NET- VIS  A  (Arora  et  al.,  2010,  Arora  et  al.,  201 1)  consists  of  a  collection  of  probability  distributions  and  an 
inference  algorithm  that  computes  an  MLE  hypothesis.  The  probability  distributions  include: 

•  A  prior  distribution  Pfe)  over  sets  of  events  e  for  a  given  interval  T.  Each  event  is  defined  by  time, 
location,  depth,  and  magnitude.  Event  times  are  distributed  according  to  a  time-homogeneous  Poisson 
process.  (This  is  easily  modified  to  allow  for  aftershocks.)  Magnitudes  are  distributed  according  to  the 
Gutenberg-Richter  distribution.  Locations  for  naturally  occurring  events  follow  a  distribution  estimated 
from  historical  data,  whereas  man-made  events  have  a  spatially  uniform  distribution  with  near  zero  depth. 

•  For  each  station  and  each  true  phase,  a  detection  probability  distribution  is  computed  given  the  event 
magnitude,  depth,  and  distance.  The  probability  model  (logistic  regression  with  local  input  features)  is 
calibrated  from  historical  data  that  document  which  events  were  detected  by  which  stations.  (Figure  1(a)). 
Given  that  wave  propagation  is  similar  from  a  particular  event  region  to  each  station  for  both  earthquakes 
and  explosions,  there  is  good  reason  to  use  the  numerous  earthquakes  to  characterize  the  detection  statistics 
of  rare  explosions. 

•  For  each  detection,  the  following  predictive  distributions  for  attributes  of  the  detection  (at  that  station) 
given  attributes  of  the  event: 

o  A  distribution  for  the  predicted  arrival  time,  given  by  the  event  time  plus  a  travel-time  distribution 
for  the  true  phase  from  the  event  location  to  the  station.  NET- VIS  A  currently  uses  the  IASPEI91 
1-D  model  with  station- specific  corrections,  i.e.,  the  same  model  as  the  IMS.  Uncertainty  is 
modeled  as  a  Laplacian  residual,  as  illustrated  in  Figure  1(b),  that  includes  pick  error. 

o  A  distribution  for  the  measured  amplitude  given  the  event  magnitude  and  distance  -  i.e.,  an 
attenuation  model.  As  a  starting  point,  the  log  amplitude  is  assumed  to  decay  linearly  with 
distance,  with  Gaussian  error.  This  model  ignores  known  deviations  from  simple  geometrical 
spreading  and  attenuation  models,  which  we  expect  to  incorporate  as  the  project  progresses. 
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o  A  multinomial  distribution  for  the  measured  phase  label  given  the  true  phase  label,  again 
estimated  from  historical  data. 

o  Distributions  for  the  measured  azimuth  and  slowness  given  the  true  azimuth  and  slowness  for  the 
event  also  follow  a  Laplacian  distribution. 

•  Finally,  for  each  station  a  time -homogeneous  Poisson  for  spurious  detections,  each  with  its  amplitude 
drawn  from  an  empirically  estimated  distribution  (a  mixture  of  Gaussians),  its  phase  label  from  an 
empirically  estimated  multinomial  distribution,  and  its  azimuth  and  slowness  from  uniform  distributions. 
The  assumption  of  uniformity  for  azimuth  and  slowness  is  questioned  by  Koper  et  al.  (2010)  and  is  valid 
only  for  a  subset  of  stations  in  our  data. 


Distance  (deg)  Time 


Figure  1.  (a)  P-phase  detection  probability  at  ASAR  for  magnitude  3.5  events,  as  a  function  of  distance,  (b) 
Laplacian  fit  to  ASAR  travel-time  residual  with  respect  to  iasp91  predictions  with  station-specific 
corrections.  Note  the  bias  of  0.6  seconds  even  after  station-specific  correction;  our  model 
automatically  compensates  for  this  bias. 

NET-VISA  is  currently  being  evaluated  with  a  view  to  eventual  deployment  by  the  CTBTO.  The  evaluation 
compares  the  output  of  NET-VISA  to  the  SEL3  bulletin  (the  final  automated  bulletin)  produced  by  the  CTBTO.  For 
the  purposes  of  this  comparison,  “ground  truth”  was  defined  by  the  LEB  (Late  Event  Bulletin),  which  is  generated 
by  post-processing  the  SEL3  output  by  a  team  of  expert  analysts.  The  LEB  itself  is  imperfect,  but  it  provides  a 
baseline  of  performance  for  a  labor-intensive  monitoring  system.  The  distribution  parameters  for  NET- VIS  A  were 
determined  based  on  2.5  months  of  IMS  data  and  tests  were  run  on  one  week  of  held-out  data.  Details  of  the 
evaluation  appear  in  the  paper  by  Le  Bras  et  al.  (201 1)  in  these  proceedings.  In  summary,  the  low-magnitude  (i.e., 
below  4)  event-detection  failure  rate  for  NET- VIS  A  is  2.5-3  times  lower  than  that  for  SEL3,  as  shown  in  Figure 
2(a).  This  improvement  is  achieved  despite  the  fact  that  the  “ground  truth”  LEB  is  derived  from  SEL3  and  is 
therefore  perhaps  biased  towards  SEL3’s  interpretation  of  the  data.  As  illustrated  in  Figure  2(b),  NET-VISA  also 
finds  events  recorded  by  local  networks  (e.g.,  NEIC,  JMA,  PRU,  NNC)  that  do  not  appear  in  LEB.  Finally,  NET- 
VISA  was  tested  on  one  week  of  data  that  included  the  DPRK  test  of  May  25,  2009;  NET-VISA  formed  the  event 
correctly  with  an  accurate  location  based  on  associating  53  detections  from  the  IMS,  of  which  50  were  also 
associated  in  the  LEB.  (SEL3  included  only  39  detections  for  the  event.) 
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Figure  2.  (a)  Comparative  detection  failure  rate  for  SEL3  and  NET-VISA,  for  a  test  set  of  852  events  (1 
week).  Rates  are  shown  for  all  events  and  for  specific  magnitude  ranges,  (b)  Posterior  location 
density  calculated  by  NET-VISA  for  NEIC  Event  12054557  (Colorado)  using  detections  from  three 
IMS  stations  (TXAR,  PDAR,  and  ANMO),  with  the  most  likely  location  shown  by  the  blue  square. 
The  white  star  shows  the  NEIC  location  using  19  stations;  the  red  circle  shows  the  nearest  SEL3 
location  (approx.  10  degrees  away);  the  event  was  not  recorded  in  LEB. 


Signal-based  Bayesian  monitoring 

Detection-based  systems  such  as  NET- VIS  A  are  fundamentally  limited  by  the  ability  of  the  underlying  signal 
detection  algorithm.  Our  current  project  will  develop  the  required  statistical  models  and  algorithms  for  a  signal- 
based  Bayesian  monitoring  system ,  which  takes  as  its  observed  data  the  network-wide  raw  signal  (waveform)  data 
itself,  thereby  avoiding  the  need  for  hard- threshold  local  detection  algorithms.  Whereas  NET- VIS  A  has  established 
the  feasibility  and  credibility  of  Bayesian  monitoring,  SIG-VISA  will  be  a  qualitatively  different  and  technically 
more  sophisticated  approach  requiring  fundamental  advances  in  understanding.  The  basic  relationship  between 
NET- VIS  A  and  SIG-VISA  is  shown  in  Figure  3 . 
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Figure  3.  SEL3,  NET-VISA,  and  SIG-VISA.  SIG-VISA  will  make  direct  use  of  waveforms,  whereas  SEL3  and 
NET- VIS  A  make  use  of  a  detection  time  series. 


Signal-based  Bayesian  monitoring  has  several  advantages  over  detection-based  approaches: 

•  Signal-based  Bayesian  monitoring  is  potentially  more  sensitive  to  low-SNR  signals  because  it  uses  Bayesian 
inference  to  determine  whether  or  not  signal  characteristics  at  all  stations  are  consistent  with  the  occurrence 
of  an  event  at  a  given  location  and  magnitude.  Roughly  speaking,  signal-based  Bayesian  monitoring  improves 
over  detection-based  methods  in  much  the  same  way  that  array  stations  improve  over  single  seismometers,  by 
summing  the  signal  recorded  at  multiple  sensors  to  increase  SNR. 
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•  Signal-based  Bayesian  monitoring  can  also  take  advantage  of  correlations  in  waveform  properties  across 
multiple  signals,  as  used  in  waveform  matching  algorithms  such  as  double-differencing  (Waldhauser  and 
Ellsworth,  2000;  Schaff  and  Richards,  2004). 

•  If  signals  from  two  events  overlap  at  a  given  station,  a  detection-based  system  may  record  only  a  single,  large 
detection,  which  may  be  incompatible  with  the  locations  and  magnitudes  of  both  events.  In  the  signal-based 
approach,  the  overlapping  signal  is  expected  and  may  in  some  cases  be  separated  into  its  components  by  the 
inference  process  (given  time-separated  signals  from  at  least  one  other  station).  Stone  et  al.  (1999)  cite  a 
similar  advantage  for  their  “unified  tracking”  approach. 

We  report  here  on  two  aspects  of  the  problem:  formulation  of  spatially  correlated  waveform  models  and  connections 
between  Bayesian  and  other  “joint”  methods  such  as  double-differencing  and  joint  hypocenter  determination. 

Waveform  models:  Preliminary  proposal 

The  first  research  challenge  lies  in  formulating  a  complete  and  consistent  probability  model  that  supports  the 
computation  of  the  likelihood  of  observing  the  measured  signal  traces,  given  any  hypothesized  collection  of  events. 
Fitting  waveform  envelopes  is  well  established  (e.g.,  Mayeda  et  al.,  2004),  and  the  simplest  approach  involves  a 
low-dimensional  parametric  envelope  descriptor  (cf.  Huseby  et  al.,  1998)  (e.g.,  triangular  or  paired-exponential) 
whose  arrival  time,  amplitude,  and  spread  depend  on  the  distance  and  magnitude  of  the  event;  an  example  of  a 
paired-exponential  envelope  template  is  shown  in  Figure  4. 


Figure  4.  A  paired-exponential  envelope  template  (green)  superimposed  on  an  actual  waveform  envelope 
(red). 


Actual  envelopes  are  not  well-modeled  by  “template  plus  i.i.d.  noise”;  typically,  they  exhibit  more  temporally 
correlated  “macro-variation”  resulting  from  path  effects.  Furthermore,  envelope  (or  waveform)  shape  is  highly 
repeatable  across  events  with  the  same  location  and  type  -  this  is  the  basis  for  waveform  cross-correlation.  To 
achieve  these  properties  we  adopt  a  hybrid  parametric/nonparametric  model  in  which  the  realized  envelope  for  a 
given  event  is  the  product  of  a  parametric  mean  envelope  (magnitude  and  distance  dependent)  and  a  stochastic 
modulation  signal.  A  simple  modulation  model  might  be  based  on  a  random  linear  combination  of  Fourier  basis 
functions  (see  Figure  5).  Ultimately  we  will  learn  a  model  from  historical  data. 


Figure  5.  A  sampled  envelope  (red)  formed  from  a  parametric  template  (green)  multiplied  by  a  random  linear 
combination  of  Fourier  basis  functions. 


A  basic  model  of  envelope  variation  would  assign  each  event  an  independently  sampled  log-modulation  signal, 
using  a  Gaussian  prior  on  the  basis  coefficients.  A  nonparametric  extension  adopts  a  Gaussian  process  model  for  the 
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basis  function  parameters.  This  captures  correlations  among  event  envelopes  that  decay  with  distance  (analogous  to 
correlation  matching );  an  illustrative  example  is  shown  in  Figure  6. 


Figure  6:  (a)  A  reference  envelope  (red)  and  a  sampled  envelope  (blue)  from  the  Gaussian-process  generative 
model  for  a  nearby  event  location,  (b)  The  same  reference  envelope  (red)  and  a  sampled  envelope 
(blue)  from  a  distant  event  location.  Here,  “nearby”  and  “distant”  are  relative  to  the  Gaussian 
process  correlation  distance. 


Bayesian  localization,  joint  hypocenter  determination,  and  double-differencing 

Accurately  inverting  event  hypocenters  is  made  difficult  by  velocity  heterogeneity,  which  introduces  correlations  to 
the  travel-time  residuals.  This  issue  has  been  addressed  by  the  simultaneous  inversion  of  hypocenters  with  station- 
specific  corrections  (e.g.,  Douglas,  1967),  parameterized  adjustments  to  the  travel  time  prediction  (Myers  et  al ., 
2007,  2009,  201 1),  or  with  velocity  structure  (e.g.,  Thurber,  1983;  Zhang  &  Thurber,  2003),  and  by  difference-based 
methods  such  as  double-differencing  (Waldhauser  and  Ellsworth,  2000).  The  latter  approach  has  the  advantage  of 
naturally  integrating  differential  arrival  time  data  obtained  through  waveform  matching  techniques  such  as  cross¬ 
correlation;  differences  obtained  in  this  way  are  more  precise  than  those  obtained  by  subtracting  picked  arrival  times 
and  so  provide  higher- quality  location  estimates  (Schaff  and  Richards,  2004).  We  have  developed  a  Bayesian 
method  for  the  joint  inversion  of  event  hypocenters  that  accounts  for  correlated  travel-time  residuals  as  a  fully 
integrated  part  of  the  probabilistic  inference.  We  refer  to  this  method  as  Bayesian  double-differencing  (BDD), 
because  it  resembles  the  method  of  double-differencing  both  in  the  sense  of  correcting  for  velocity  heterogeneity 
and  in  its  ability  to  exploit  waveform-based  relative  arrival  times.  Unlike  double-differencing  and  related 
techniques,  we  solve  for  absolute  as  well  as  relative  locations,  similar  to  Myers  et  al.  (2007,  2009,  2011). 

Model:  Our  approach  models  the  slowness  (inverse  velocity)  field  as  a  Gaussian  process  (GP).  In  a  GP,  values  at 
any  pair  of  points  are  jointly  Gaussian,  with  a  covariance  kernel  that  typically  depends  on  the  distance  between  the 
points.  In  standard  GP  regression  (a.k.a.  kriging ),  the  posterior  random  field  is  inferred  based  on  observations  at  a 
finite  set  of  points;  in  our  case,  however,  we  do  not  observe  the  slowness  values  directly,  and  our  goal  is  not  to 
estimate  the  slowness  field  itself  (although  a  posterior  estimate  is  obtainable  from  the  inference  process).  Instead,  we 
make  use  of  the  fact  that  a  GP  slowness  field  induces  a  GP  residual  field  at  each  station,  with  the  covariance  kernel 
of  each  residual  field  given  by  a  double  integral  along  event-station  paths  of  the  slowness  field  covariance  kernel 
(Rodi  and  Myers,  2007).  Because  these  station-specific  residual  fields  are  all  induced  by  the  same  underlying  GP 
slowness  field,  the  travel-time  residuals  themselves  are  jointly  Gaussian  across  stations.  (Note  that  unlike  difference 
methods,  which  consider  the  relationships  between  event  residuals  independently  at  each  station,  our  model  shares 
information  naturally  between  nearby  stations.) 

The  joint  distribution  on  the  residuals  given  the  event  locations 
constitutes  a  forward  model,  which  we  combine  with  a  prior 
distribution  on  event  locations  and  invert  using  Bayes’  rule  to  yield  a 
posterior  distribution  on  locations  conditioned  on  arrival  times.  Of 
course,  the  true  arrival  times  are  not  observed;  we  must  extend  the 
model  to  reflect  sensor  noise  in  the  form  of  “pick  errors.”  This  is 
accomplished  by  an  additional  Gaussian  noise  term  that  is  added 
independently  for  each  observation  as  part  of  the  forward  model. 

It  is  similarly  possible  to  incorporate  arrival-time  differences  obtained 
by  waveform  matching:  these  differences  are  represented  in  the 
forward  model  as  the  differences  in  the  true  arrival  times,  plus  a 
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Gaussian  noise  term  (again  independent  for  each  difference  observation)  with  lower  variance  than  that  of  the  pick 
error  to  reflect  the  higher  precision  attained  by  waveform  matching.  The  Bayesian  inversion  seamlessly  makes  use 
of  both  the  absolute  time  constraints  provided  by  picked  arrival  times  along  with  the  relative  time  constraints  of  the 
difference  observations  when  available;  the  lower  variance  ascribed  to  difference  observations  causes  them  to  be 
treated  as  stronger  constraints,  analogously  to  the  higher  weighting  commonly  given  to  these  constraints  in  double- 
difference  methods  (also  see  Myers  et  al.,  2011,  these  proceedings). 


Figure  8.  (a)  Simulated  2-D  event  clusters  located  using  BDD  (+)  along  with  their  true  locations  (x).  Stations 
are  positioned  at  the  four  corners  and  each  event  is  observed  by  three  of  the  four  stations.  The 
background  shows  the  MLE  slowness  field  given  the  inferred  locations,  (b)  The  true  slowness  field, 
with  true  event  locations  marked,  (c)  Variance  of  the  slowness  field  posterior  (darker  is  higher). 

Inference:  The  posterior  distribution  over  the  event  locations  themselves  has  no  simple  analytic  form  (Figure  7 
shows  an  example),  so  the  inversion  is  nontrivial  to  perform.  We  instead  apply  Markov  Chain  Monte  Carlo 
(MCMC),  a  standard  technique  for  Bayesian  inference,  to  sample  from  the  posterior  and  estimate  the  expected 
locations  of  the  joint  set  of  events  (e.g.  Lomax  et  al.,  2000;  Myers  et  al.,  2007).  This  is  time-consuming  but  tractable 
for  the  small-scale  simulations  we  have  considered  thus  far. 

Although  the  goal  of  the  inference  is  to  produce  accurate  locations,  the  model  also  defines  a  posterior  distribution 
over  the  slowness  field,  which  is  simple  to  calculate  given  a  point  estimate  of  the  event  locations.  Figures  8(a)  and 
8(b)  compare  a  slowness  field  inferred  from  12  events  to  the  true  slowness  field,  showing  that  our  method 
effectively  recovers  the  tomography  while  inverting  for  event  locations.  In  this  sense  it  resembles  other  tomographic 
approaches  (e.g.  Thurber,  1983;  Zhang  &  Thurber,  2003),  but  maintains  a  full  posterior  distribution  over  the 
tomography  instead  of  a  point  estimate,  similar  to  Shapiro  and  Ritzwoller  (2004),  Pasyanos  et  al.,  (2006)  and  Hauser 
et  al.,  (201 1).  As  illustration,  Figure  8(c)  shows  the  variance  of  the  slowness  field  posterior;  note  that  we  are  most 
confident  of  the  tomography  along  ray  paths  between  events 
and  nearby  stations  (located  at  the  four  corners). 

Experiments:  We  have  conducted  initial  simulation 
experiments  with  clustered  events  detected  by  three  of  four 
stations  and  arrival  times  sampled  from  the  Gaussian  process 
prior.  These  experiments  did  not  make  use  of  waveform 
matching.  Several  distance-weighting  strategies  for  double- 
differencing  were  evaluated,  including  the  one  described  by 
Waldhauser  &  Ellsworth  (2000);  the  results  presented  in 
Figure  9  are  from  the  best-performing  strategy,  which  sets 
the  weight  of  each  difference  residual  constraint  equal  to  the 
prior  covariance  of  the  two  corresponding  residuals  in  the 
Bayesian  model.  As  Figure  9  shows,  the  Bayesian  method 
significantly  outperformed  both  double-differencing  as  well 
as  a  simple  independent  inversion  of  the  event  locations, 
both  in  the  absolute  location  error  and  in  the  relative 
locations  between  events. 


g 

UJ 


3- 

V) 


§  oto 


I  Absolute 
Location  Erroi 
3  Relative 
Location  Erroi 


Indep 


Figure  9:  MSE  in  absolute  and  relative  locations 
(22  simulation  runs,  5  events  per  run). 
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CONCLUSIONS  AND  RECOMMENDATIONS 


Prior  results  suggest  that  Bayesian  monitoring  is  a  promising  technique  for  analyzing  streams  of  parametric 
detections  from  multiple  stations  to  form  a  global  event  bulletin  and  may  be  preferable  to  existing  deployed  methods 
for  global  association.  Our  current  work,  in  its  very  early  stages,  is  aimed  at  extending  Bayesian  monitoring  with 
generative  models  of  waveform  envelopes  to  improve  detection  and  association  and  performing  joint  inference  of 
event  locations  and  slowness  fields  to  improve  localization  accuracy. 
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